MultiBIC: an improved speaker segmentation technique for TV shows
Abstract
Speaker segmentation systems usually have trouble detecting short segments, which leads to a high number of deletions and therefore harms system performance. This is a particular problem when segmenting multimedia content such as movies and TV shows, where dialogs among characters are very common. This paper presents a modification of the BIC algorithm that markedly reduces the number of deletions without increasing the number of false alarms. The modification, referred to as MultiBIC, assumes that two change-points are present in a window of data, whereas the conventional BIC approach assumes that there is just one. As a result, the system can notice when a window contains more than one change-point and so finds shorter segments than traditional BIC.
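The abstract does not give the MultiBIC formula, so the following is only a minimal sketch of the idea under stated assumptions: each segment is modeled by a single full-covariance Gaussian, the one-change test is the standard ΔBIC, and the two-change test simply splits the window into three segments and doubles the penalty for the two extra models. All function names and the parameters `lam`, `min_len`, and `step` are hypothetical and not taken from the paper.

```python
import numpy as np

def _nlogdet(x):
    """Return n * log|Sigma| for the ML full-covariance Gaussian fitted to x."""
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
    _, logdet = np.linalg.slogdet(cov)
    return len(x) * logdet

def _penalty(n, d, lam=1.0):
    """BIC penalty for one additional full-covariance Gaussian model."""
    return 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)

def delta_bic_one(x, i, lam=1.0):
    """Standard delta-BIC for a single change-point at frame i (> 0 favors a change)."""
    n, d = x.shape
    gain = 0.5 * (_nlogdet(x) - _nlogdet(x[:i]) - _nlogdet(x[i:]))
    return gain - _penalty(n, d, lam)

def delta_bic_two(x, i, j, lam=1.0):
    """Assumed two-change variant: three segments, two extra models penalized."""
    n, d = x.shape
    gain = 0.5 * (_nlogdet(x) - _nlogdet(x[:i])
                  - _nlogdet(x[i:j]) - _nlogdet(x[j:]))
    return gain - 2.0 * _penalty(n, d, lam)

def best_two_changes(x, min_len=10, step=5, lam=1.0):
    """Grid-search the pair (i, j) that maximizes the two-change delta-BIC."""
    n = len(x)
    best = (-np.inf, None, None)
    for i in range(min_len, n - 2 * min_len + 1, step):
        for j in range(i + min_len, n - min_len + 1, step):
            score = delta_bic_two(x, i, j, lam)
            if score > best[0]:
                best = (score, i, j)
    return best
```

Calling `best_two_changes(features)` on a window of feature vectors (one row per frame) returns the best score and the two candidate change-points. Under this reading, a positive score suggests that the window contains a short middle segment that a single-change test would have to absorb into one of its neighbors.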
Similar resources
Visual speech segmentation and speaker recognition for transcription of TV news
This paper describes a method for visual segmentation of TV news. In this method, TV news shows are segmented according to the visual stream of the video recordings. Human faces are found in the individual visual segments with the help of a fast face-detection algorithm. The detected faces are compared with visual GMMs that have been trained from the video picture of the single broad...
An Improved Pixon-Based Approach for Image Segmentation
An improved pixon-based method is proposed in this paper for image segmentation. In this approach, a wavelet thresholding technique is initially applied on the image to reduce noise and to slightly smooth the image. This technique causes an image not to be oversegmented when the pixon-based method is used. Indeed, the wavelet thresholding, as a pre-processing step, eliminates the unnecessary details...
Speaker Role Recognition on TV Broadcast Documents
In this paper, we present the results obtained by a state-of-the-art system for Speaker Role Recognition (SRR) on the TV broadcast documents issued from the REPERE Multimedia Challenge. This SRR system is based on the assumption that cues about speaker roles may be extracted from a set of 36 low-level features issued from the outputs of a Speaker Diarization process. Starting from manually annot...
An Improved Model-based Speake
In this paper, we report our recent work on speaker segmentation. Without a priori information about speaker number and speaker identities, the audio stream is segmented, and segments of the same speaker are grouped together. Speakers are represented by Gaussian Mixture Models (GMMs), then an HMM network is used for segmentation. However, unlike other model-based segmentation methods, the speak...
Neural Network-Based Learning Kernel for Automatic Segmentation of Multiple Sclerosis Lesions on Magnetic Resonance Images
Background: Multiple Sclerosis (MS) is a degenerative disease of the central nervous system. MS patients have some dead tissues in their brains called MS lesions. MRI is an imaging technique sensitive to soft tissues such as the brain, and it shows MS lesions as hyper-intense or hypo-intense signals. Since manual segmentation of these lesions is a laborious and time-consuming task, automatic segmentation ...
Publication date: 2010